Faguo Wu

PACT: Learning Diverse Diagnostic Strategies via Privileged Synthesis and Branch Consensus

Jun 08, 2026
Via arXiv

METRO: Towards Strategy Induction from Expert Dialogue Transcripts for Non-collaborative Dialogues

Apr 14, 2026
Via arXiv

Mosaic: Multimodal Jailbreak against Closed-Source VLMs via Multi-View Ensemble Optimization

Apr 10, 2026
Via arXiv

Z-Erase: Enabling Concept Erasure in Single-Stream Diffusion Transformers

Mar 26, 2026
Via arXiv

Towards Compositional Generalization in LLMs for Smart Contract Security: A Case Study on Reentrancy Vulnerabilities

Jan 11, 2026
Via arXiv

Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood

Jun 10, 2025
Via arXiv

Preference-Guided Reinforcement Learning for Efficient Exploration

Jul 09, 2024
Via arXiv

Learning Diverse Policies with Soft Self-Generated Guidance

Feb 07, 2024
Via arXiv

Trajectory-Oriented Policy Optimization with Sparse Rewards

Jan 04, 2024
Via arXiv

Policy Optimization with Smooth Guidance Rewards Learned from Sparse-Reward Demonstrations

Dec 30, 2023
Via arXiv